Corpus: rus-tj_web_2015_100K

4.4.1.5 Number of Word-N-grams at Sentence Endings

Number of distinct word N-grams (N = 1 to 5) at sentence endings, counted over the first K sentences

K        # of words  # of bigrams  # of trigrams  # of 4-grams  # of 5-grams
100              93            95             95            96            96
1000            807           923            958           976           978
10000          6653          8872           9318          9464          9538
100000        39964         77603          90162         94153         95459
1000000       39965         77604          90163         94154         95460
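
The counts in the table above could be reproduced along the following lines. This is only a minimal sketch, not the tool that generated this page: the function name ending_ngram_counts is hypothetical, tokens are assumed to be whitespace-separated, and a sentence with fewer than N tokens is assumed to contribute no N-gram for that N.

```python
from typing import List


def ending_ngram_counts(sentences: List[str], k: int, max_n: int = 5) -> List[int]:
    """Count distinct sentence-ending word N-grams (N = 1..max_n)
    among the first k sentences.

    Assumptions (not taken from the source page): whitespace
    tokenization, and sentences shorter than N tokens are skipped for that N.
    """
    # One set of distinct ending N-grams per N; index n - 1 holds the N-grams of length n.
    seen = [set() for _ in range(max_n)]
    for sentence in sentences[:k]:
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            if len(tokens) >= n:
                # The last n tokens form the sentence-ending N-gram.
                seen[n - 1].add(tuple(tokens[-n:]))
    return [len(s) for s in seen]


# Toy corpus, with the final punctuation mark treated as a separate token.
corpus = ["a b c .", "x b c .", "a b c .", "d ."]
for k in (2, 4, 100):
    print(k, ending_ngram_counts(corpus, k))
```

If K is larger than the number of available sentences, the slice simply covers the whole corpus, so the counts stop growing.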


Zipf's diagram for sentence endings


[Figure: Gnuplot diagram]

9601 msec needed at 2020-06-22 00:24